Abstract

Efficient fault diagnosis methods for complex machinery and equipment are now required in several areas. Several fault diagnosis methods based on different theories and approaches have been proposed in the literature. In general, these methods use mathematical or statistical models, accumulated experience, or process historical data to perform fault diagnosis. Although methods based on models or experience have been shown to be effective, they have the disadvantage of requiring prior knowledge of the dynamic system in question. In contrast, methods based on process historical data do not require prior knowledge; they rely solely on data obtained directly from the dynamic system. The application of so-called “Evolving Intelligent Systems” to fault diagnosis from process data has been shown to be a promising approach. This paper proposes an evolving fuzzy classifier based on a new approach that combines a recursive clustering algorithm with a drift detection method, and applies it to fault diagnosis of dynamic systems. The novel approach provides greater robustness to outliers and noise present in data from process sensors. The classifier is evaluated in fault diagnosis of an interacting tank system, and the results are promising.

1.  Introduction 

The advance of technology has resulted in the emergence of complex machinery and equipment, which pose great challenges for management and maintenance. In many industries, for instance, fault diagnosis in major processes is vitally important to ensure normal operation of a plant and to avoid economic losses, reduced safety and environmental damage. This context led to the emergence of new concepts in the management and maintenance of machinery and equipment, such as Condition-Based Maintenance (CBM). In CBM, machine or equipment data obtained in real time are used to infer its working condition (or faulty condition), allowing maintenance to be scheduled and equipment breakdowns to be prevented. Building on CBM, the concept of intelligent maintenance has also emerged (Vachtsevanos, Lewis, Roeme, Hess, & Wu, 2006).

In past decades, several fault diagnosis methods based on different approaches have been proposed in the literature. These methods use mathematical models, statistical models, accumulated experience, or process historical data to perform fault diagnosis (Venkatasubramanian, 2005). Fault diagnosis methods based on process historical data have received great emphasis recently (Abellan-Nebot & Subiron, 2010), and several works have already proposed data-based diagnostic methods employing intelligent systems, mainly artificial neural networks and fuzzy systems (Jardine, Lin, & Banjevic, 2006). Nevertheless, despite the good performance achieved by intelligent systems in fault diagnosis, they tend to face difficulties when the problem involves complex non-stationary dynamic systems. In these systems, physical parameters, operating characteristics and fault behaviours change over time, requiring an adaptive fault diagnosis system that can adapt itself to changes in the monitored system. To address fault diagnosis in these cases, some works propose the use of so-called “Evolving Intelligent Systems” (Lughofer & Guardiola, 2008; Filev, Chinnam, Tseng, & Baruah, 2010; Lemos, Caminhas, & Gomide, 2013).

Evolving intelligent systems are based on artificial neural networks, fuzzy inference systems, or a combination of both (neurofuzzy networks). Their main characteristic is the ability to gradually determine both their structure and parameters from input data acquired in online mode and often in real time. Applications of evolving intelligent systems have been growing in recent years. Many authors have reported successful applications to complex real-world problems involving modeling, control, classification or prediction (Angelov, Filev, & Kasabov, 2010). Evolving clustering is the most widely used approach to define the structure of an evolving intelligent system (Kasabov & Song, 2002; Angelov & Filev, 2003; Leng, McGinnity, & Prasad, 2005; Rong, Sundararajan, Huang, & Saratchandran, 2006; Lughofer, 2008; Soleimani-B., Lucas, & Araabi, 2010; Lima, Hell, Gomide, & Ballini, 2010; Lemos, Caminhas, & Gomide, 2011). These algorithms generally adopt a mechanism to update the structure (creation, modification or removal of clusters) and the parameters of the system using some measure of similarity between input data samples and existing clusters. This mechanism may lead to an erroneous definition of the structure, since outliers or noisy samples (as data acquired by sensors in industrial environments usually are) that exceed the similarity threshold can generate clusters that do not effectively represent the spatial structure of the data (Lemos et al., 2011).

In fault diagnosis problems, evolving intelligent systems based on recursive clustering algorithms must be robust to outliers and data noise. In this case, each new cluster created is usually associated with a new faulty condition. Thus, if the clustering procedure is not robust, the fault diagnosis model tends to have a high false alarm rate, i.e., new faulty conditions are erroneously detected. In this context, this paper proposes a fault diagnosis approach based on an evolving fuzzy classifier that uses a new robust unsupervised recursive clustering algorithm. The clustering algorithm is a modified version of the Gustafson-Kessel (GK) clustering algorithm (Gustafson & Kessel, 1979) that incorporates the Drift Detection Method (DDM) (Gama, Medas, Castillo, & Rodrigues, 2004).

Considered a powerful clustering algorithm, the GK algorithm, unlike many others, allows the identification of clusters with different shapes and orientations in space. The algorithm adapts the distance metric to the shape of each cluster using an estimate of the cluster covariance matrix. Furthermore, the algorithm has the advantage of being relatively insensitive to data scale and to the initialization of the partition matrix (Filev & Georgieva, 2010). Drift detection, according to the literature, is a method to detect gradual changes in the context of input data, where a context is understood as a set of data generated while the process is stationary. Drift detection methods are suitable for machine learning applications in which algorithms are applied to real-world problems in complex, non-stationary and dynamic environments (Sebastiao & Gama, 2009). Among the several methods proposed for drift detection, the DDM algorithm employs a simple and computationally efficient method to detect the moments when changes occur, and it can be embedded into any learning algorithm, increasing its efficiency in problems involving non-stationary dynamic models.

In this paper, a new unsupervised recursive clustering algorithm is proposed in which any clustering update depends not only on the similarity measure but also on monitoring changes in the input data flow, which gives the algorithm greater robustness to the presence of outliers and noise. A cluster merging mechanism was also incorporated into the algorithm to enable the removal of redundant clusters. The fuzzy rule base of the proposed classifier is updated whenever the cluster structure is modified. The cluster centers and covariance matrices are used as parameters of the fuzzy rules. Multivariate Gaussian membership functions are employed in the rules to avoid information loss when there is interaction between input variables. Given the characteristics of the proposed recursive clustering algorithm, the main benefits achieved by the classifier used in this work are: 1) the ability to learn the dynamic system model in online mode and, if necessary, in real time; 2) the ability to adapt whenever changes are detected in the monitored system, allowing the application to real problems; 3) a low false alarm rate and a high fault isolation rate due to the robustness to outliers and noise, increasing the reliability of diagnosis. To evaluate the performance of the proposed approach in fault diagnosis, an interacting tank system simulator was used to simulate normal and several faulty conditions. Outliers and noise were added to the simulated data to evaluate the robustness of the proposed algorithm.

The rest of the paper is organized as follows. Section 2 presents the theoretical concepts regarding recursive clustering and drift detection, together with the proposed recursive clustering algorithm. Section 3 presents the proposed classifier and its application in fault diagnosis. Section 4 presents the simulations and results. Finally, Section 5 presents the conclusion and suggestions for future work.

2. Theoretical Concepts: Recursive Clustering Algorithm and Drift Detection

2.1. Recursive Gustafson-Kessel Algorithm

Clustering algorithms are among the most useful tools for solving pattern recognition problems that involve the analysis of unlabeled data, i.e., unsupervised learning (Duda, Hart, & Stork, 2001). Over the past decades, thousands of clustering algorithms have been proposed (Jain, 2010). Unlike many clustering algorithms that employ the Euclidean distance as a measure of similarity, the GK algorithm employs a Mahalanobis-like distance, which allows the identification of clusters with ellipsoidal shapes. In this algorithm the distance is defined as follows:

$$d_{ik}^2 = (x_k - v_i) A_i (x_k - v_i)^T \qquad (1)$$

where $d_{ik}^2$ represents the distance between an input data sample $x_k = [x_{k1}, \ldots, x_{kn}]$, $k = 1, \ldots, N$, and the cluster center $v_i$, $i = 1, \ldots, c$, where $N$ is the number of data samples, $n$ is the number of data dimensions, and $c$ is the number of clusters. The norm-inducing matrix $A_i$, $i = 1, \ldots, c$, defines the shape and orientation of each cluster in space. The GK algorithm uses an iterative process to estimate the parameters of the clusters (the cluster center and fuzzy covariance matrix), which finishes when a convergence criterion is reached. An extended version of the GK algorithm, named the evolving GK-like algorithm (eGKL), is proposed in Filev and Georgieva (2010). This approach estimates the number of clusters and adapts their parameters recursively, maintaining the advantages of the GK algorithm. To evaluate the similarity between a new data sample and one of the existing clusters, the eGKL algorithm employs the Mahalanobis distance, defined as follows:

$$D_{ik}^2 = (x_k - v_i) F_i^{-1} (x_k - v_i)^T \qquad (2)$$

where $F_i$, $i = 1, \ldots, c$, is a covariance matrix. The current data sample belongs to an existing cluster if its distance to the cluster center is smaller than the cluster radius. The eGKL algorithm estimates the radius of each cluster using an approach inspired by concepts of statistical process control. In this approach, a sample is assumed to belong to a cluster if the following relationship holds:

$$D_{ik}^2 < \chi^2_{n,\beta} \qquad (3)$$

where $\chi^2_{n,\beta}$ is the value of the Chi-squared distribution with $n$ degrees of freedom and confidence level $\beta$. The degrees of freedom $n$ correspond to the input space dimension, and the confidence level $\beta$ is a parameter of the algorithm. This approach has the advantage of avoiding the problem called “curse of dimensionality” (Hastie, Tibshirani, & Friedman, n.d.), i.e., the increase in the distance between two adjacent points as the input space dimensionality increases, since $\chi^2_{n,\beta}$ grows with the dimension of the input data. If the condition given by Eq. (3) is satisfied, the current data sample belongs to a cluster, so the cluster parameters are updated. Otherwise, it is assumed that the current data sample does not belong to any of the existing clusters, and a new cluster is created. The complete procedure of the eGKL algorithm can be seen in Filev and Georgieva (2010).
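
As an illustration of the similarity test in Eqs. (2) and (3), the following Python sketch computes the squared Mahalanobis distance and compares it with a Chi-squared threshold. It is a minimal sketch rather than the eGKL implementation: the function names are hypothetical, $\beta$ is treated as a probability level, and the closed-form threshold is valid only for $n = 2$ (for other dimensions the threshold would have to come from a statistical table or library).

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def mahalanobis_sq(x, v, F):
    """Eq. (2): D^2 = (x - v) F^{-1} (x - v)^T, computed without forming F^{-1}."""
    d = [xi - vi for xi, vi in zip(x, v)]
    y = solve(F, d)  # y = F^{-1} d
    return sum(di * yi for di, yi in zip(d, y))

def chi2_quantile_2d(beta):
    """Chi-squared quantile for n = 2 only, where the CDF is 1 - exp(-t / 2)."""
    return -2.0 * math.log(1.0 - beta)

def belongs_to_cluster(x, v, F, threshold):
    """Eq. (3): the sample is assigned to the cluster when D^2 < threshold."""
    return mahalanobis_sq(x, v, F) < threshold
```

With $\beta = 0.95$ and $n = 2$ the threshold is about 5.99, so a sample at squared distance 2 from a center is accepted while one at squared distance 9 is not.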

2.2.  Drift  Detection  Method 

Several drift detection methods have been proposed in the literature. In general, they can be classified into two categories: methods that perform adaptive learning at regular intervals regardless of the occurrence of changes, and methods that detect changes first and subsequently adapt the learning to these changes (Sebastiao & Gama, 2009). The DDM algorithm belongs to the second category and is simple to apply. It is based on monitoring the number of errors produced by a learning model during prediction. The method uses the Binomial distribution to give the general form of the probability for the random variable that represents the number of prediction errors in a sequence of $n$ input data samples. In the DDM algorithm, for each sample $k$ in the sequence, the error rate is the probability of a prediction error $p_k$, with standard deviation $s_k = \sqrt{p_k(1 - p_k)/k}$. According to the Probably Approximately Correct (PAC) learning model (Mitchell, 1997), the error rate of the learning algorithm decreases as the number of input data samples increases, provided the distribution is stationary; a significant increase in the error rate therefore suggests a context change. In this case, it is assumed that the current model is inappropriate and should be updated. While monitoring the error, the DDM algorithm defines a warning level and a drift level. When $p_k + s_k$ exceeds the warning level, the data samples are stored in memory. If $p_k + s_k$ exceeds the drift level, a context change is considered to have occurred. In this situation, the model induced by the learning algorithm should be updated with the data samples stored since the warning level was reached. It is possible that the error increases and, after reaching the warning level, decreases to lower levels. This situation corresponds to a false alarm: there is no change of context, so no action is required and the data samples stored in memory are no longer needed. More details about the DDM method can be found in Gama et al. (2004).
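
The monitoring logic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference DDM implementation: the class and method names are hypothetical, the warning and drift levels are expressed relative to the lowest recorded $p + s$ using the default multipliers 2 and 3 given in Section 2.3, and the `min_samples` guard is an added assumption that keeps the first few samples from triggering a detection.

```python
import math

class DriftMonitor:
    """Tracks the error rate of a learner and flags warning and drift levels."""

    def __init__(self, warning_level=2.0, drift_level=3.0, min_samples=30):
        self.z1 = warning_level          # multiplier for the warning level
        self.z2 = drift_level            # multiplier for the drift level
        self.min_samples = min_samples   # assumption: wait before testing
        self.k = 0                       # samples seen so far
        self.errors = 0                  # prediction errors seen so far
        self.p_min = 1.0                 # stored minimum, initialized to one
        self.s_min = 1.0

    def update(self, is_error):
        """Process one prediction outcome and return 'stable', 'warning' or 'drift'."""
        self.k += 1
        if is_error:
            self.errors += 1
        if self.k < self.min_samples:
            return "stable"
        p = self.errors / self.k
        s = math.sqrt(p * (1.0 - p) / self.k)
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s    # new lowest error level
        if p + s > self.p_min + self.z2 * self.s_min:
            return "drift"
        if p + s > self.p_min + self.z1 * self.s_min:
            return "warning"
        return "stable"
```

A caller would typically start buffering samples when `update` returns "warning", rebuild the model from the buffer on "drift", and discard the buffer if the status falls back to "stable".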

2.3.  Proposed  Recursive  Clustering  Algorithm 

The algorithm proposed in this work is an unsupervised recursive clustering algorithm with a new cluster update mechanism. It is a recursive version of the GK algorithm, inspired by the eGKL algorithm, that incorporates the DDM algorithm. Thus, clustering is performed in online mode and, if necessary, in real time.

Since there is no a priori information about the clustering structure, nor an initial set of input data samples, the proposed algorithm starts by associating the center of the first cluster $v_1$ with the first input data sample $x_1$. The corresponding covariance matrix $F_1$, the learning rate $\alpha_1$ and the number of samples associated with the first cluster $M_1$ are defined as follows: $v_1 = x_1$; $F_1 = F_{init}$; $\alpha_1 = \alpha_{init}$; $M_1 = 1$, where $F_{init} = \gamma I$; $I$ is an identity matrix of size $n$; $\gamma$ is a small positive number (default value: $\gamma = 10^{-2}$) and $\alpha_{init} \in [0, 1]$ is the initial learning rate (default value: $\alpha_{init} = 0.5$). If all data samples have been processed, the algorithm stops; otherwise, a new input data sample $x_k$ is obtained and the distance between the data sample and the centers of the existing clusters is computed as:

$$D_{ik}^2 = (x_k - v_i) F_i^{-1} (x_k - v_i)^T \qquad (4)$$

The similarity between the current data sample and the existing clusters is verified by the similarity condition:

$$D_{ik}^2 < \chi^2_{n,\beta} \qquad (5)$$


where $\chi^2_{n,\beta}$ is the value of the Chi-squared distribution with $n$ degrees of freedom and confidence level $\beta$. The degrees of freedom $n$ correspond to the input space dimension, and the confidence level $\beta$ is a parameter of the algorithm. If the similarity condition given by Eq. (5) is met for a cluster, it is assumed that the current sample belongs to this cluster. The cluster parameters (center, covariance matrix, learning rate and number of samples in the cluster) are then updated as follows:

$$v_q = v_q + \alpha_q (x_k - v_q) \qquad (6)$$

$$F_q = F_q + \alpha_q \left( (x_k - v_q)^T (x_k - v_q) - F_q \right) \qquad (7)$$

[learning-rate update for $\alpha_q$: expression missing in the source] (8)

$$M_q = M_q + 1 \qquad (9)$$

where $q = \arg\min_i (D_{ik}^2)$. If the similarity condition given by Eq. (5) is not met, it is assumed that the current sample does not belong to any existing cluster. The algorithm then increments a variable that represents the number of dissimilarities, $M_{dis} = M_{dis} + 1$, and the error probability and standard deviation are computed as follows:

$$p = \frac{M_{dis}}{k} \qquad (10)$$

$$s = \sqrt{p(1 - p)/k} \qquad (11)$$
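
The two branches above (update of the winning cluster, or update of the dissimilarity statistics) can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names. Two assumptions are made: the covariance update of Eq. (7) uses the center as it was before the update of Eq. (6), and the learning rate is passed in unchanged because its update rule is not reproduced here.

```python
import math

def update_cluster(v, F, M, alpha, x):
    """Apply Eqs. (6) and (7) and increment the sample count of the winning cluster."""
    # Assumption: deviations are measured from the center before it is moved.
    d = [xi - vi for xi, vi in zip(x, v)]
    n = len(x)
    v_new = [vi + alpha * di for vi, di in zip(v, d)]
    F_new = [[F[r][c] + alpha * (d[r] * d[c] - F[r][c]) for c in range(n)]
             for r in range(n)]
    return v_new, F_new, M + 1

def dissimilarity_stats(m_dis, k):
    """Eqs. (10) and (11): dissimilarity rate p and its standard deviation s."""
    p = m_dis / k
    s = math.sqrt(p * (1.0 - p) / k)
    return p, s
```

For example, with $v = [0, 0]$, $F = I$, $\alpha = 0.5$ and $x = [2, 0]$, the center moves halfway to the sample and the variance along the first axis grows while the variance along the second axis shrinks.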

In this algorithm, the $p$ and $s$ values are stored whenever $p + s$ reaches its lowest value during the process, giving $p_{min}$ and $s_{min}$. If the following condition is met:

$$p + s < p_{min} + s_{min} \qquad (12)$$

then $p_{min} = p$ and $s_{min} = s$. Note that, when the algorithm starts, the stored values $p_{min}$ and $s_{min}$ must be initialized to positive numbers; a value of one for each is suggested. To decide whether the current data sample $x_k$ represents a new cluster or is just an outlier, warning and drift conditions are evaluated. The warning condition is verified as:

$$p + s > p_{min} + z_1 \cdot s_{min} \qquad (13)$$

where $z_1$ is the warning level (default value: $z_1 = 2$). If the warning level is reached, the current data sample is stored in a window of samples $W(data)_j$, $j = 1, \ldots, m$ (where $m$ is the current size of the window), and the drift condition is then evaluated. Otherwise, the algorithm processes the next input data sample. The drift condition is verified as:

$$p + s > p_{min} + z_2 \cdot s_{min} \qquad (14)$$

where $z_2$ is the drift level (default value: $z_2 = 3$). If the drift level is reached, a new cluster is created, $c = c + 1$, and the center and the covariance matrix of the new cluster are determined by the samples stored in the data window as follows:

$$v_c = \frac{1}{m} \sum_{j=1}^{m} W(data)_j \qquad (15)$$

$$F_c = \mathrm{cov}(W(data)_j) \qquad (16)$$

The remaining parameters of the new cluster (learning rate and number of samples in the cluster) are initialized as: $\alpha_c = \alpha_{init}$; $M_c = 1$.

In order to avoid redundant cluster formation, the similarity between clusters is checked during the update. To that end, distances between the centers of the clusters are computed as follows:

$$D_{ij}^2 = (v_i - v_j) F_i^{-1} (v_i - v_j)^T \qquad (17)$$

$$D_{ji}^2 = (v_j - v_i) F_j^{-1} (v_j - v_i)^T \qquad (18)$$

If one of the following similarity conditions is met for two existing clusters $i$ and $j$:

$$D_{ij}^2 < \chi^2_{n,\beta} \qquad (19)$$

$$D_{ji}^2 < \chi^2_{n,\beta} \qquad (20)$$

the clusters are merged. These clusters have a hyper-ellipsoidal shape, defined by a mean vector, a covariance matrix, and a number of samples associated with each one. The combination of these two clusters produces a new one with parameters computed as follows (Kelly, 1994):

$$M_i = M_i + M_j \qquad (21)$$

$$v_i = \frac{M_i}{M_i + M_j} v_i + \frac{M_j}{M_i + M_j} v_j \qquad (22)$$

$$F_i = \frac{M_i - 1}{M_i + M_j - 1} F_i + \frac{M_j - 1}{M_i + M_j - 1} F_j + \frac{M_i M_j}{(M_i + M_j)(M_i + M_j - 1)} (v_i - v_j)^T (v_i - v_j) \qquad (23)$$

where the values of $M_i$, $v_i$ and $F_i$ on the right-hand sides are those of cluster $i$ before merging.

Algorithm 1 summarizes the proposed recursive clustering algorithm.
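
The merging step can be sketched in Python as follows. This is a hedged illustration with a hypothetical function name; it assumes that the covariance merge of Eq. (23) is the usual pooled sample covariance of two groups with unbiased covariances, which reproduces exactly the covariance of the combined samples.

```python
def merge_clusters(v_i, F_i, M_i, v_j, F_j, M_j):
    """Merge two clusters given their centers, covariance matrices and sample counts."""
    M = M_i + M_j                       # Eq. (21)
    n = len(v_i)
    # Eq. (22): weighted mean of the two centers.
    v = [(M_i * a + M_j * b) / M for a, b in zip(v_i, v_j)]
    # Eq. (23), assumed pooled form: within-cluster terms plus a
    # between-cluster term built from the difference of the centers.
    d = [a - b for a, b in zip(v_i, v_j)]
    w = M_i * M_j / (M * (M - 1))
    F = [[((M_i - 1) * F_i[r][c] + (M_j - 1) * F_j[r][c]) / (M - 1) + w * d[r] * d[c]
          for c in range(n)] for r in range(n)]
    return v, F, M
```

As a one-dimensional check, merging the group {0, 2} (mean 1, variance 2) with the group {4, 6, 8} (mean 6, variance 4) gives mean 4 and variance 10, which are the mean and unbiased variance of {0, 2, 4, 6, 8}.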


Algorithm 1: Recursive Clustering Algorithm with Drift Detection

Input: $x_k$, $\chi^2_{n,\beta}$, $F_{init}$, $\alpha_{init}$, $z_1$, $z_2$;
Output: $v_i$, $F_i$;
Read the first data sample $x_1$;
Initialize the first cluster;
for $k = 2, 3, \ldots$ do
    Read $x_k$;
    Compute $D_{ik}^2$ for all clusters;
    Identify the closest cluster;
    if $D_{ik}^2 < \chi^2_{n,\beta}$ then
        Update the closest cluster;
    else
        Update the dissimilarity number $M_{dis}$;
        Compute $p$ and $s$;
        if $p + s < p_{min} + s_{min}$ then
            Update $p_{min}$ and $s_{min}$;
        end if
        if $p + s > p_{min} + z_1 \cdot s_{min}$ then
            Store $x_k$ in the data window $W(data)_j$;
        end if
        if $p + s > p_{min} + z_2 \cdot s_{min}$ then
            Create new cluster;
        end if
    end if
    Compute $D_{ij}^2$ and $D_{ji}^2$ for all clusters;
    if $D_{ij}^2 < \chi^2_{n,\beta}$ or $D_{ji}^2 < \chi^2_{n,\beta}$ then
        Merge redundant clusters;
    end if
end for

3. Proposed Evolving Fuzzy Classifier for Fault Diagnosis

Pattern classification algorithms are used in many current applications, such as fingerprint recognition for security systems, handwriting recognition on touch-screen computers, DNA sequence identification in medical diagnostic software and fault diagnosis in industrial equipment. In this context, the problem of pattern classification consists in assigning a class or a category to each data sample from a set of “raw” data (Duda et al., 2001). Pattern classification algorithms based on fuzzy rules have been used in many applications due to their advantages over classic pattern classification algorithms, especially their good prediction performance in real problems and the transparency of their linguistic rules (Jang, Sun, & Mizutani, 1997), which allows an easy comprehension of the dependence between pattern characteristics. The typical architecture of a fuzzy classifier consists of a set of IF ... THEN fuzzy rules, defined as:


$$\text{RULE}_i: \text{IF } x_{k1} \text{ IS } \mu_{i1} \text{ AND } \ldots \text{ AND } x_{kn} \text{ IS } \mu_{in} \text{ THEN } y_i = L_i \qquad (24)$$

where $[x_{k1}, \ldots, x_{kn}]$ are the input variables or patterns of dimension $n$; $[\mu_{i1}, \ldots, \mu_{in}]$ are the antecedent fuzzy sets of the $i$th fuzzy rule; $y_i$ is the output; and $L_i$ is the crisp output corresponding to the class label from the set $[1, \ldots, K]$, where $K$ is the number of classes. For each new input data sample $x_k$, the classification is obtained by assigning to it the label of the class associated with the rule having the highest activation degree. The class is determined as follows:

$$y_k = L_{i^*} \qquad (25)$$


where $i^* = \arg\max_{1 \le i \le R} (\tau_i)$; $R$ is the number of fuzzy rules and $\tau_i$ is the activation degree of the $i$th fuzzy rule, defined by a t-norm, usually expressed as a product operator:

$$\tau_i = \mathop{T}_{j=1}^{n} \mu_{ij}(x_{kj}) \qquad (26)$$


where $\mu_{ij}$ are the membership functions of fuzzy sets defined by Gaussians:

$$\mu_{ij}(x_{kj}) = \exp\left( -\frac{(x_{kj} - v_{ij})^2}{2\sigma_{ij}^2} \right) \qquad (27)$$


where $v_{ij}$ and $\sigma_{ij}^2$ represent, respectively, the center and variance of the membership functions. Usually, to implement this fuzzy classifier architecture, clustering is performed in the input and/or output space. Rules are then created using one-dimensional (univariate) fuzzy sets, generated by projecting the clusters onto the axis of each variable. According to Lemos et al. (2011), this approach can lead to information loss if there is interaction between variables; to avoid this, the authors propose the use of multivariate Gaussian membership functions to represent the antecedent fuzzy sets of each rule. These membership functions are described as:

$$H(x) = e^{-\frac{1}{2}(x - v)\Sigma^{-1}(x - v)^T} \qquad (28)$$


where $v$ is a $1 \times n$ center vector and $\Sigma$ is an $n \times n$ symmetric positive definite matrix. The center vector is the modal value and represents the typical value of $H(x)$, while the matrix $\Sigma$ denotes the dispersion and represents the spread of $H(x)$. Thus, each cluster found by the clustering algorithm is associated with a fuzzy rule, and the parameters of the multivariate Gaussian membership function are defined as the parameters of the corresponding cluster. If multivariate Gaussian membership functions are used, the fuzzy classifier has a rule set defined as:

$$\text{RULE}_i: \text{IF } x_k \text{ IS } A_i \text{ THEN } y_i = L_i \qquad (29)$$

where $A_i$ is the fuzzy set with multivariate Gaussian membership function of the $i$th fuzzy rule, with parameters extracted from the corresponding cluster. In general, more than one rule may be needed to describe a class, e.g., when the class is multimodal. In this case, a single rule may not be sufficient to describe all possible variations of the same class. Thus, the fuzzy classifier aggregates the outputs of rules associated with the same class using an s-norm. The result of the aggregation can be interpreted as a rule of the following form:

$$(\text{IF } x_k \text{ IS } A_i) \text{ OR } (\text{IF } x_k \text{ IS } A_j) \text{ OR } \ldots \text{ (IF } x_k \text{ IS } A_k) \text{ THEN } y_j = L_j \qquad (30)$$

This aggregation gives the degree of relevance of each known class. Each new sample $x_k$ is assigned to the class with the highest relevance degree.
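
The classification step with multivariate Gaussian membership functions can be sketched in Python as follows. This is a minimal illustration with hypothetical names; it assumes that the maximum is used as the s-norm when aggregating rules of the same class, which is one common choice and is not stated in the text.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def membership(x, v, sigma):
    """Eq. (28): multivariate Gaussian membership of x for center v and matrix sigma."""
    d = [xi - vi for xi, vi in zip(x, v)]
    y = solve(sigma, d)  # y = sigma^{-1} d
    return math.exp(-0.5 * sum(di * yi for di, yi in zip(d, y)))

def classify(x, rules):
    """Return the winning label and the relevance degree of every class.

    rules is a list of (center, matrix, label) tuples, one per fuzzy rule.
    """
    relevance = {}
    for v, sigma, label in rules:
        activation = membership(x, v, sigma)
        # Assumption: max is the s-norm aggregating rules of the same class.
        relevance[label] = max(relevance.get(label, 0.0), activation)
    best = max(relevance, key=relevance.get)
    return best, relevance
```

Because two rules may share a label, a multimodal class such as "normal" can be represented by several clusters while still producing a single relevance degree.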

In some pattern classification applications, the classes of the data samples are not known a priori. In these situations an unsupervised learning process is required to build the classifier. Moreover, in applications where pattern classification must be performed in real time, learning should be performed using incremental algorithms that process each data sample once, as a data stream. A recursive clustering algorithm addresses both requirements. We propose in this paper an evolving fuzzy classifier based on the recursive clustering algorithm with drift detection presented in Section 2.3, which allows the creation of a fuzzy rule base from input data samples in online mode and, if necessary, in real time. This approach differs from the ones employed in traditional fuzzy classifiers, which require some training (usually supervised) conducted in off-line mode. To update the rule base, the proposed evolving fuzzy classifier uses the output of the recursive clustering algorithm described in the previous section. For each new input data sample, if a new cluster is created, a new fuzzy rule given by Eq. (29) is added to the rule base, and the cluster parameters are used as parameters of the multivariate Gaussian membership function of the antecedent. The rule consequent (the crisp output corresponding to the class label) must be defined by experts or system operators, since in unsupervised learning processes incoming online samples are usually not pre-labelled. If a cluster is updated, the corresponding class label is determined as the consequent of the fuzzy rule with the highest activation degree, and user intervention is not necessary. If two clusters are merged by the recursive clustering algorithm, the corresponding fuzzy rules are also merged to represent a single class. It should be noted that both the number of rules and the number of classes are determined during the evolving process, so these parameters need not be set a priori. Algorithm 2 summarizes the procedures of the classifier.


Algorithm 2: Evolving Fuzzy Classifier

Input: $x_k$;
Output: $y_k$;
Initialize the classifier;
for $k = 1, 2, \ldots$ do
    Read $x_k$;
    Execute the recursive clustering algorithm with drift detection;
    if new cluster is created then
        Create new fuzzy rule;
        Define the new class elicited by expert / system operator;
        $y_k$ = label of the new class;
    end if
    if cluster is updated then
        Update the corresponding fuzzy rule;
        Find the most active rule;
        $y_k$ = label of the most active rule;
    end if
    if clusters are merged then
        Merge the corresponding fuzzy rules;
    end if
end for
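
The rule-base bookkeeping of Algorithm 2 can be sketched in Python as follows. This is a hypothetical illustration, not the authors' implementation: the `RuleBase` class, its method names and the `ask_label` callback (standing in for the expert or system operator) are all assumptions, and the merged rule is assumed to keep the label of the surviving cluster, which the text does not specify.

```python
class RuleBase:
    """Maps cluster identifiers to class labels as the clustering evolves."""

    def __init__(self, ask_label):
        self.ask_label = ask_label   # hypothetical callback to the operator
        self.labels = {}             # cluster id -> class label

    def on_created(self, cluster_id):
        """A new cluster needs a new rule whose label comes from the operator."""
        self.labels[cluster_id] = self.ask_label(cluster_id)
        return self.labels[cluster_id]

    def on_updated(self, most_active_id):
        """An updated cluster is classified automatically by the most active rule."""
        return self.labels[most_active_id]

    def on_merged(self, kept_id, removed_id):
        """Merging removes the redundant rule (assumption: the kept label wins)."""
        self.labels.pop(removed_id, None)
        return self.labels[kept_id]
```

A driver loop would call one of these three methods per sample, depending on which event the clustering algorithm reports, so the operator is consulted only when a new cluster appears.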


Figure 1. Fault diagnosis with the evolving fuzzy classifier.


Figure 1 illustrates the application of the proposed classifier
for fault diagnosis. Data samples are obtained from a dynamic
system as a continuous stream, usually provided by sensors that
monitor the process. These data may require pre-processing for
feature extraction. The rule set of the classifier is initially
empty. Rules are created as the recursive clustering algorithm
creates clusters to represent the data stream. Each rule is
related to a class, and each class is related to a condition of
the dynamic system, representing either normal operation or a
fault. When a new rule is created, the system operator is
notified and provides the label of the new class, which defines
it as a normal operation condition or as a specific fault. All of
the necessary diagnostic information (the fuzzy rules and the
class labels) is stored in a unified database and updated while
the system is in use. After an initial period of operation, the
classifier database will contain a set of fuzzy rules and class
labels. When a new data sample is associated with an existing
cluster, the classifier updates the corresponding fuzzy rule and
classifies the dynamic system condition with the label in the
consequent of the fuzzy rule with the highest activation degree.
In this situation, user intervention is not required, and the
classification of the dynamic system condition is performed
automatically. The main feature of the classifier proposed in
this work is its ability to diagnose faults in a complex
non-stationary dynamic system in online mode and, if necessary,
in real time. The classifier requires neither a priori
information about the dynamic model nor historical process data.
This allows the classifier to construct a rule base in an
evolving way and, with the aid of the operator, to learn to
diagnose faults as they occur. Thus, the proposed classifier is
able to adapt to the dynamic system, making it possible to
diagnose faults not previously known.
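
The processing loop described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the recursive clustering algorithm with drift detection is replaced by a hypothetical nearest-center rule with a fixed distance threshold (`radius`), the activation degree is assumed to be a Gaussian-shaped function of the distance to the cluster center, cluster merging is omitted, and `ask_operator` is a placeholder callback that stands in for the expert or system operator. The names `Rule` and `EvolvingClassifierSketch` are invented for this example.

```python
import math
from typing import Callable, List, Sequence


class Rule:
    """One fuzzy rule: a cluster center (antecedent) and a class label (consequent)."""

    def __init__(self, center: Sequence[float], label: str) -> None:
        self.center = list(center)
        self.label = label
        self.count = 1  # number of samples absorbed by this cluster

    def distance(self, x: Sequence[float]) -> float:
        return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, self.center)))

    def activation(self, x: Sequence[float], spread: float) -> float:
        # Assumed Gaussian-shaped activation degree in (0, 1].
        return math.exp(-0.5 * (self.distance(x) / spread) ** 2)

    def update(self, x: Sequence[float]) -> None:
        # Recursive (running-mean) update of the cluster center.
        self.count += 1
        self.center = [c + (xi - c) / self.count for c, xi in zip(self.center, x)]


class EvolvingClassifierSketch:
    """Simplified stand-in for the loop of Algorithm 2 (no drift detection, no merging)."""

    def __init__(self, ask_operator: Callable[[Sequence[float]], str],
                 radius: float = 1.0) -> None:
        self.rules: List[Rule] = []       # the rule base starts empty
        self.ask_operator = ask_operator  # hypothetical operator callback
        self.radius = radius              # hypothetical cluster-creation threshold
        self.interventions = 0            # how often the operator was asked

    def step(self, x: Sequence[float]) -> str:
        nearest = min(self.rules, key=lambda r: r.distance(x), default=None)
        if nearest is None or nearest.distance(x) > self.radius:
            # New cluster: create a rule and let the operator name the class.
            label = self.ask_operator(x)
            self.interventions += 1
            self.rules.append(Rule(x, label))
            return label
        # Existing cluster: update its rule, then answer with the most active rule.
        nearest.update(x)
        best = max(self.rules, key=lambda r: r.activation(x, self.radius))
        return best.label


if __name__ == "__main__":
    clf = EvolvingClassifierSketch(
        ask_operator=lambda x: "normal" if x[0] < 5 else "fault 1")
    for sample in [(1.0, 1.0), (1.1, 0.9), (9.0, 9.0), (9.1, 8.9)]:
        print(sample, clf.step(sample))
```

In this sketch the operator is consulted only when a sample falls outside every existing cluster, which mirrors the behaviour described in the text: after a class has been labeled once, later samples of the same condition are classified without intervention.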

4.  Simulations  and  Results 

The proposed classifier was evaluated for fault diagnosis in an
interacting tank system. The interacting tank system model
employed in this work is based on the system proposed by
Braga, Jota, Polito, and Pena (1995) and allows the simulation of
faults that resemble those of real industrial plants. As
illustrated in Fig. 2, the system comprises a reservoir (TQ-1)
and two passively interconnected tanks (TQ-2 and TQ-3). With the
interacting tank system model, it is possible to simulate faults
on the actuators (pneumatic valves and pumps), on the system
components (connection pipes between tanks) and on the sensors,
with different sets of parameters. The types of faults are
detailed in Table 1. In each fault simulation, the system starts
in normal operation, and a fault is introduced halfway through
the simulation interval. As an example, Figure 3 shows the curves
of the TQ-2 level, TQ-3 level, TQ-2 input flow rate and TQ-3
output flow rate in a fault simulation (FCV-1 valve tightness).


Figure  2.  Representation  of  the  interacting  tank  system. 


Table 1. Types of faults on the interacting tank system.

Index   Description
0       Normal operation
1       FCV-1 valve tightness
2       FCV-2 valve tightness
3       BA-1 pump shutdown
4       BA-2 pump shutdown
5       Pipe clogging between TQ-1 and TQ-2
6       Pipe clogging between TQ-1 and TQ-3
7       Pipe clogging between TQ-2 and TQ-3
8       Pipe leakage between TQ-2 and TQ-3
9       TQ-3 level sensor fault
10      TQ-3 output flow rate sensor fault
11      TQ-2 input flow rate sensor fault

Figure 3. Fault simulation: FCV-1 valve tightness.

Different scenarios were used in the fault diagnosis experiments.
Each scenario consists of the simulation of a sequence of 3 to 11
randomly selected fault types within a set of
faults, with periods of normal operation between faults. To
assess the robustness of the proposed classifier to noise in the
data, random Gaussian noise with zero mean and a standard
deviation equal to 1% of the nominal value of the variable under
normal operation was added to each monitored variable. The
classifier received as inputs, in online mode, data samples of
the monitored variables of the interacting tank system: TQ-2
level, TQ-3 level, TQ-2 input flow rate and TQ-3 output flow
rate. For each fault sequence, the classifier output was compared
to the sequence provided. Since the classifier starts with an
empty fuzzy rule set, the first data samples should correspond to
normal operation of the system, so that the first rule created
describes normal operation. For the experiments, the parameters
of the recursive clustering algorithm were set to:
$\chi^2_{n,\beta} = 9.4877$; $\Sigma_{init} = 10^{-2} I$;
$\alpha_{init} = 0.5$; $z_1 = 2$; $z_2 = 3$.
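
The noise model and the first clustering parameter can be illustrated with a short sketch. Two things here are assumptions and not taken from the source: the nominal values in the usage example are hypothetical placeholders (the paper does not list them), and the value 9.4877 is read as the upper 5% quantile of a chi-square distribution with 4 degrees of freedom, one per monitored variable. The closed-form tail probability used below is valid for an even number of degrees of freedom only.

```python
import math
import random
from typing import Dict, Optional


def add_measurement_noise(sample: Dict[str, float],
                          nominal: Dict[str, float],
                          rel_std: float = 0.01,
                          rng: Optional[random.Random] = None) -> Dict[str, float]:
    """Add zero-mean Gaussian noise with std = rel_std * nominal value per variable."""
    rng = rng or random.Random()
    return {name: value + rng.gauss(0.0, rel_std * nominal[name])
            for name, value in sample.items()}


def chi2_upper_tail_even_dof(x: float, dof: int) -> float:
    """P(X > x) for a chi-square variable with an even number of degrees of freedom.

    Closed form: exp(-x/2) * sum_{i=0}^{dof/2 - 1} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2:
        raise ValueError("the closed form requires a positive, even dof")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(dof // 2))


if __name__ == "__main__":
    # Hypothetical nominal operating point; the variable names are illustrative.
    nominal = {"tq2_level": 30.0, "tq3_level": 20.0,
               "tq2_inflow": 5.0, "tq3_outflow": 5.0}
    print(add_measurement_noise(nominal, nominal, rng=random.Random(0)))
    # Tail probability of the threshold for 4 monitored variables.
    print(chi2_upper_tail_even_dof(9.4877, dof=4))
```

Under these assumptions the threshold corresponds to a tail probability of about 0.05, which is consistent with reading the parameter as a 95% chi-square bound on the distance of a sample to a cluster.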

Figure 4 shows, as an example, the results of fault diagnosis in
a simulated 5-fault scenario, where the estimated output of the
proposed classifier (classified fault sequence) can be compared
with the desired output (selected fault sequence) for the input
data samples. The results show that the classifier was able to
correctly diagnose all the interacting tank system faults.
Despite the presence of noise in the data samples, the occurrence
of false alarms or misclassifications (represented by isolated
points on the graph) is low, even in the scenario with the
highest number of possible faults.

Annual Conference of the Prognostics and Health Management Society 2014

Figure 4. Desired output and estimated output of the proposed
classifier in the 5-fault scenario.

The performance of the classifier was evaluated in terms of fault
detection and fault classification, as suggested in Vachtsevanos
et al. (2006). Three metrics were calculated for fault detection:
Probability of Detection (POD), Probability of False Alarm (POFA)
and Accuracy (ACC). For fault classification, the Fault Isolation
Rate (FIR) was used. Other metrics used to assess the performance
of the proposed classifier are Detection Delay Time (DDT),
Isolation Delay Time (IDT) and Operator Intervention Rate (OIR).
All fault diagnosis results obtained with the proposed classifier
on the interacting tank system were compared to the results
obtained with the evolving fuzzy classifier proposed by Lemos et
al. (2013). For the experiments, the parameters of this
alternative classifier were set to: $w = 100$, $\lambda = 0.001$,
$\alpha = 0.01$, Ty = 0.01. According to the authors, this
combination has been found experimentally to provide a good
balance between the false alarm rate and the sensitivity of the
fault detection and diagnosis approach.
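
As an illustration of how the detection and classification metrics could be computed from a desired and an estimated label sequence, consider the sketch below. The source does not spell out the formulas, so the usual confusion-matrix definitions are assumed: POD = TP / (TP + FN), POFA = FP / (FP + TN) and ACC = (TP + TN) / N, where a "positive" is any sample labeled as a fault. FIR is assumed here to be the share of detected fault samples whose estimated fault index matches the desired one. Index 0 denotes normal operation, as in Table 1; the function name `diagnosis_metrics` is invented for this example.

```python
from typing import Dict, Sequence

NORMAL = 0  # index of normal operation in Table 1


def diagnosis_metrics(desired: Sequence[int],
                      estimated: Sequence[int]) -> Dict[str, float]:
    """Return POD, POFA, ACC and FIR in percent (assumed definitions, see text)."""
    if len(desired) != len(estimated) or len(desired) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    tp = fn = fp = tn = isolated = 0
    for d, e in zip(desired, estimated):
        if d != NORMAL and e != NORMAL:
            tp += 1                 # fault present and some fault reported
            if d == e:
                isolated += 1       # ... and the right fault was named
        elif d != NORMAL:
            fn += 1                 # fault present but reported as normal
        elif e != NORMAL:
            fp += 1                 # false alarm
        else:
            tn += 1                 # normal reported as normal

    def pct(num: int, den: int) -> float:
        return 100.0 * num / den if den else float("nan")

    return {
        "POD": pct(tp, tp + fn),
        "POFA": pct(fp, fp + tn),
        "ACC": pct(tp + tn, len(desired)),
        "FIR": pct(isolated, tp),
    }


if __name__ == "__main__":
    desired = [0, 0, 0, 0, 1, 1, 1, 1]
    estimated = [0, 0, 0, 0, 0, 1, 1, 2]
    print(diagnosis_metrics(desired, estimated))
```

Separating detection (fault versus normal) from isolation (which fault) in this way follows the two-level evaluation described in the text.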

Table 2 summarizes the results for both classifiers using the
fault detection metrics described. The results show that the
classifier proposed in this work achieves higher fault detection
rates and accuracy in all scenarios, with no false alarms. These
results indicate the effectiveness of the algorithm in detecting
simulated faults in the interacting tank system. Despite its
lower fault detection rates and lower accuracy, the classifier
proposed by Lemos et al. (2013) also showed no false alarms.

Table 3 summarizes the results for both classifiers using the
fault classification metrics described. The results show that the
classifier proposed in this work presented a higher fault
isolation rate in all scenarios. In all scenarios, operator
intervention in fault classification was very low. These results
show the ability of the classifier to automatically diagnose
almost all faults after their first occurrence, and also reveal
its ability to learn. Note that, in general, the classifier
proposed by Lemos et al. (2013) had a lower performance in fault
classification than the proposed classifier and needed more
operator interventions.

Table 4 summarizes the results for both classifiers using the
time metrics for fault detection and classification. A comparison
between the average values of fault detection time and fault
isolation time shows that fault classification is faster after
the first occurrence of each type of fault, since the classifier
database already holds the fuzzy rules and labels for all types
of detected faults and no operator intervention is required. In
the experiments, the classifier proposed by Lemos et al. (2013)
responded faster than the classifier proposed in this work, which
is related to the different update mechanisms of the clustering
algorithms used in each classifier.
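
One way delay-time metrics such as DDT and IDT could be obtained from the desired and estimated label sequences is sketched below. The source does not give the formulas, so the definitions are assumptions: DDT is taken as the time from the fault onset to the first sample reported as any fault, and IDT as the time from that detection to the first sample carrying the correct fault index. The sampling period `dt` and the function name `delay_times` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

NORMAL = 0  # index of normal operation in Table 1


def delay_times(desired: Sequence[int],
                estimated: Sequence[int],
                dt: float) -> Tuple[Optional[float], Optional[float]]:
    """Return (DDT, IDT) in seconds for the first fault in the sequence.

    None is returned for a delay that cannot be measured (no fault present,
    fault never detected, or fault never correctly isolated).
    """
    # First sample at which a fault is actually present.
    onset = next((k for k, d in enumerate(desired) if d != NORMAL), None)
    if onset is None:
        return None, None
    fault = desired[onset]

    # First sample at or after the onset that is reported as some fault.
    detected = next((k for k in range(onset, len(estimated))
                     if estimated[k] != NORMAL), None)
    if detected is None:
        return None, None

    # First sample at or after detection that names the correct fault.
    isolated = next((k for k in range(detected, len(estimated))
                     if estimated[k] == fault), None)

    ddt = (detected - onset) * dt
    idt = None if isolated is None else (isolated - detected) * dt
    return ddt, idt


if __name__ == "__main__":
    desired = [0, 0, 0, 3, 3, 3, 3, 3]
    estimated = [0, 0, 0, 0, 1, 1, 3, 3]
    print(delay_times(desired, estimated, dt=0.1))
```

With these definitions, a fault whose rule and label already exist in the classifier database can be isolated at the same sample at which it is detected, which gives an isolation delay of zero.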

Table 2. Fault detection performance.

             Proposed                          Lemos et al. (2013)
Scenario     POD (%)   POFA (%)   ACC (%)      POD (%)   POFA (%)   ACC (%)
3 faults     99.38     0.00       99.67        89.35     0.00       94.67
5 faults     99.25     0.00       99.63        83.04     0.00       91.75
7 faults     99.53     0.00       99.67        82.27     0.00       91.10
9 faults     99.12     0.00       99.56        79.78     0.00       89.89
11 faults    99.20     0.00       99.60        76.02     0.00       88.01

Table 3. Fault classification performance.

             Proposed               Lemos et al. (2013)
Scenario     FIR (%)   OIR (%)      FIR (%)   OIR (%)
3 faults     99.55     0.05         94.67     0.28
5 faults     96.76     0.04         91.88     0.29
7 faults     94.24     0.03         90.30     0.30
9 faults     92.69     0.03         89.86     0.31
11 faults    91.43     0.03         88.01     0.31

Table 4. Fault detection and classification time.

             Proposed               Lemos et al. (2013)
Scenario     DDT (s)   IDT (s)      DDT (s)   IDT (s)
3 faults     0.065     0.003        0315      0.003
5 faults     0.753     0.680        0.017     0.003
7 faults     1.482     1.321        0.021     0.004
9 faults     1.936     1.826        0.018     0.004
11 faults    2.327     2.204        0.018     0.004

To evaluate the robustness of the proposed classifier in the
presence of outliers in the data, another experiment was
conducted. In this experiment, a 5-fault scenario was simulated.
Outliers were inserted in the data samples, i.e., some samples
were corrupted with high-variance noise. Even in the presence of
outliers, the fault diagnosis results for this experiment show
that the proposed classifier was able to correctly detect and
diagnose all faults considered. This result indicates that the
classifier was able to correctly distinguish between outliers and
valid data samples. The results of this experiment are presented
in Table 5 and Table 6. These tables show that the proposed
classifier has virtually the same fault diagnosis performance
with and without outliers, and again produced no false alarms.
This experiment showed the greater robustness of the classifier
proposed in this work compared with the classifier proposed by
Lemos et al. (2013), since the latter showed larger differences
in fault detection and fault classification rates between the
scenarios with and without outliers.
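
The corruption of samples with high-variance noise can be sketched as follows. The source does not report how many samples were corrupted or how large the noise was, so `outlier_prob` and `outlier_scale` are hypothetical parameters chosen only for illustration, as are the nominal values in the usage example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def inject_outliers(samples: Sequence[Sequence[float]],
                    nominal: Sequence[float],
                    outlier_prob: float = 0.01,
                    outlier_scale: float = 0.5,
                    rng: Optional[random.Random] = None
                    ) -> Tuple[List[List[float]], List[int]]:
    """Corrupt a random subset of samples with high-variance Gaussian noise.

    Each sample is corrupted with probability `outlier_prob`; the noise on
    each variable has std = outlier_scale * nominal value of that variable.
    Returns the (possibly corrupted) samples and the indices that were hit.
    """
    rng = rng or random.Random()
    corrupted: List[List[float]] = []
    hit: List[int] = []
    for k, sample in enumerate(samples):
        row = list(sample)
        if rng.random() < outlier_prob:
            row = [v + rng.gauss(0.0, outlier_scale * n)
                   for v, n in zip(row, nominal)]
            hit.append(k)
        corrupted.append(row)
    return corrupted, hit


if __name__ == "__main__":
    nominal = [30.0, 20.0, 5.0, 5.0]           # hypothetical nominal values
    stream = [list(nominal) for _ in range(1000)]
    noisy, indices = inject_outliers(stream, nominal, rng=random.Random(1))
    print(len(indices), "of", len(noisy), "samples corrupted")
```

Returning the indices of the corrupted samples makes it possible to check afterwards whether a classifier created spurious clusters or raised false alarms exactly at those positions.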

Table 5. Fault detection performance with outliers.

                    Proposed                          Lemos et al. (2013)
Scenario            POD (%)   POFA (%)   ACC (%)      POD (%)   POFA (%)   ACC (%)
without outliers    99H3      0.00       99H3         83.78     0.00       91.75
with outliers       99.26     0.00       99.63        79.00     0.00       89.51

Table 6. Fault classification performance with outliers.

                    Proposed               Lemos et al. (2013)
Scenario            FIR (%)   OIR (%)      FIR (%)   OIR (%)
without outliers    96.73     0.04         91.88     0.30
with outliers       96.34     0.04         89.00     0.32

5.  Conclusion 

An evolving fuzzy classifier for fault diagnosis of dynamic
systems was presented in this work. The proposed classifier is
composed of a set of fuzzy rules created and updated by a
recursive clustering algorithm. A new mechanism for cluster
updating based on a drift detection method is employed, in which
the update of a cluster depends not only on the similarity
measure, but also on monitoring of the data context. As suggested
by the simulation results, this feature gives the proposed
classifier robustness to outliers and noise. An interacting tank
system model was used to evaluate the classifier. The classifier
was able to detect and classify all faults with high performance,
even in the presence of outliers and noise. The high fault
isolation rate and low false alarm rate obtained in all simulated
scenarios showed that the recursive clustering algorithm with the
drift detection method was able to efficiently distinguish data
samples that represent clusters from invalid data. Moreover, the
proposed classifier was able to automatically diagnose almost all
faults, requiring operator intervention in a small percentage of
cases. This demonstrates the advantage of the continuous and
incremental learning of the classifier over other classifiers
that require retraining whenever an unknown type of fault is
found. The advantages of the classifier proposed in this work
are: the ability to learn from faults in online mode and in real
time; the ability to adapt to changes in the dynamic system; and
robustness to outliers and noise in the input data. In summary,
the proposed classifier has shown itself to be a promising
alternative for fault diagnosis applications where other methods
prove to be inefficient or less advantageous because of the
characteristics of such systems. In future work, we will
investigate the application of the proposed algorithm to
real-time fault diagnosis and prognosis of industrial machines
and equipment.